Social Media: a Text Classification Approach
Authors
Abstract
The emergence of social media has given rise to many platforms where dissatisfied customers can share their service encounter experiences. Such customer feedback is widely recognized as a valuable source of information for improving service quality. Because customer complaints are sparsely distributed and non-complaint posts cover diverse topics, manually identifying complaints in social media is time-consuming and inefficient. This study proposes a supervised learning approach consisting of sample enlargement and classifier construction. Using a small set of labeled samples as training data, reliable complaint and non-complaint samples are identified from the unlabeled dataset during the sample enlargement process. SVM and KNN algorithms are then trained on the combination of the enlarged and labeled samples to construct the classifier. Empirical results show that the proposed approach can efficiently distinguish complaints from non-complaints in social media, especially when the number of labeled samples is very small.
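The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how "reliable" samples are selected, so the similarity-weighted KNN vote and the `threshold` value below are assumptions, as are the toy texts, the bag-of-words features, and all function names. Only a KNN classifier is shown so that the sketch needs nothing beyond the standard library; the SVM mentioned in the abstract is omitted.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_predict(vec, samples, k=3):
    """Similarity-weighted vote of the k nearest samples.
    Returns (label, share of the total similarity won by that label),
    or (None, 0.0) when no neighbour shares any term with `vec`."""
    nearest = sorted(((cosine(vec, v), y) for v, y in samples),
                     key=lambda p: p[0], reverse=True)[:k]
    weights = Counter()
    for sim, y in nearest:
        weights[y] += sim
    total = sum(weights.values())
    if total == 0:
        return None, 0.0
    label, weight = weights.most_common(1)[0]
    return label, weight / total

def enlarge(labeled, unlabeled_texts, k=3, threshold=0.8):
    """Sample enlargement (assumed criterion): add an unlabeled text only
    if the weighted KNN vote over the labeled set is confident enough."""
    enlarged = list(labeled)
    for text in unlabeled_texts:
        vec = vectorize(text)
        label, share = knn_predict(vec, labeled, k)
        if label is not None and share >= threshold:
            enlarged.append((vec, label))
    return enlarged

# Hypothetical toy data: a small labeled set and a few unlabeled posts.
labeled = [(vectorize(t), y) for t, y in [
    ("my order arrived broken and support ignored me", "complaint"),
    ("terrible service and my refund was not received", "complaint"),
    ("delivery was late and my package arrived damaged", "complaint"),
    ("watching the game with friends tonight", "non-complaint"),
    ("lovely weather for a walk in the park", "non-complaint"),
    ("new phone launches next week", "non-complaint"),
]]
unlabeled = [
    "refund not received and support ignored my order",  # clearly a complaint
    "walk in the park with friends tonight",             # clearly not
    "my phone arrived next week",                        # ambiguous, skipped
    "hello world",                                       # no overlap, skipped
]

enlarged = enlarge(labeled, unlabeled)

# Classifier construction: KNN over the labeled plus enlarged samples.
label, share = knn_predict(vectorize("support ignored my refund"), enlarged)
print(len(enlarged) - len(labeled), "samples added; prediction:", label)
```

The confidence threshold is the main design knob in this sketch: a higher value adds fewer but cleaner samples, which matters most when the initial labeled set is very small.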
Similar resources
Rough Set Techniques for Text Classification and Sentiment Analysis in Social Media
Sentiment Analysis (SA) is an ongoing research area in the field of text mining and classification. SA finds a computational domain from opinions and subjectivity of text data in online social media. Sentiments are inherited in the form of simple lexicons with symbols and texts having noise of irregular texts in complex forms. It is also seen that the high dimensional growth of lexical blends used b...
High capacity steganography tool for Arabic text using 'Kashida'
Steganography is the ability to hide secret information in a cover-media such as sound, pictures and text. A new approach is proposed to hide a secret in Arabic text cover media using "Kashida", an Arabic extension character. The proposed approach is an attempt to maximize the use of "Kashida" to hide more information in Arabic text cover-media. To achieve this, some algorithms have been des...
Topic Modeling and Classification of Cyberspace Papers Using Text Mining
The global cyberspace networks provide individuals with platforms where they can interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and much more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...
Named Entity Recognition for Code Mixing in Indian Languages using Hybrid Approach
Automating the process of Named Entity Recognition in social media text has received a lot of attention over the past few years. Named Entities are real-world objects such as Person, Organization, Product, Location. Identifying these entities in social media text is an important and challenging task due to the informal nature of text present on social media. One such challenge that is faced in recognizing...
User Classification with Multiple Textual Perspectives
Textual information is of critical importance for automatic user classification in social media. However, most previous studies model textual features from a single perspective, while the text in a user homepage typically possesses different styles, such as original messages and comments from others. In this paper, we propose a novel approach, namely ensemble LSTM, to user classification by ...
Author gender identification from text using Bayesian Random Forest
Nowadays, users' heavy use of virtual environments and their connections via social networks like Facebook, Instagram, and Twitter make it more necessary than before to identify shared subjects in these environments. There are several applications that benefit from reliable methods for inferring age and gender of users in social media. Such applications exist across a wide range of fields,...